Reducing Reparameterization Gradient Variance

Andrew Miller, Nick Foti, Alexander D'Amour, Ryan P. Adams

Neural Information Processing Systems

Optimization with noisy gradients has become ubiquitous in statistics and machine learning. Reparameterization gradients, or gradient estimates computed via the "reparameterization trick," represent a class of noisy gradients often used in Monte Carlo variational inference (MCVI). However, when these gradient estimators are too noisy, the optimization procedure can be slow or fail to converge. One way to reduce noise is to generate more samples for the gradient estimate, but this can be computationally expensive. Instead, we view the noisy gradient as a random variable and form an inexpensive approximation of the generating procedure for the gradient sample. This approximation has high correlation with the noisy gradient by construction, making it a useful control variate for variance reduction. We demonstrate our approach on a non-conjugate hierarchical model and a Bayesian neural net, where our method attained orders-of-magnitude (20- to 2,000-fold) reductions in gradient variance, resulting in faster and more stable optimization.
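The control-variate idea in this abstract can be illustrated with a minimal, self-contained sketch. The toy estimator and its cheap linear approximation below are illustrative choices for demonstration, not the paper's actual model or approximation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples g of some expectation, plus a cheap approximation
# g_tilde that is highly correlated with g and whose mean is known
# in closed form (here E[g_tilde] = 1.0 exactly).
z = rng.normal(size=100_000)
g = np.exp(z)        # noisy samples; E[g] = e^{1/2}
g_tilde = 1.0 + z    # cheap linear approximation; known mean 1.0

# Optimal control-variate coefficient c* = Cov(g, g_tilde) / Var(g_tilde).
c = np.cov(g, g_tilde)[0, 1] / np.var(g_tilde)

# Same expectation as g, but lower variance by construction.
g_cv = g - c * (g_tilde - 1.0)

print(np.var(g) / np.var(g_cv))  # variance reduction factor (> 1)
```

The correction term has known mean zero, so the estimator stays unbiased; the stronger the correlation between the noisy sample and its approximation, the larger the variance reduction.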



NEON2: Finding Local Minima via First-Order Oracles

Zeyuan Allen-Zhu, Yuanzhi Li

Neural Information Processing Systems

We propose a reduction for non-convex optimization that can (1) turn a stationary-point-finding algorithm into a local-minimum-finding one, and (2) replace Hessian-vector product computations with only gradient computations. It works in both the stochastic and the deterministic settings, without hurting the algorithm's performance. As applications, our reduction turns Natasha2 into a first-order method without hurting its theoretical performance. It also converts SGD, GD, SCSG, and SVRG into algorithms that find approximate local minima, outperforming some of the best known results.
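The second part of the reduction, replacing Hessian-vector products with gradient computations, can be sketched via a finite difference of gradients, since H(x)v ≈ (∇f(x + εv) − ∇f(x)) / ε. The quadratic test function below is illustrative, not from the paper:

```python
import numpy as np

def hvp_via_gradients(grad, x, v, eps=1e-5):
    """Approximate the Hessian-vector product H(x) @ v using only two
    gradient evaluations: H(x) v ≈ (grad(x + eps*v) - grad(x)) / eps.
    This is the kind of gradient-only oracle such reductions rely on."""
    return (grad(x + eps * v) - grad(x)) / eps

# Toy quadratic f(x) = 0.5 x^T A x, so grad f(x) = A x and H = A exactly,
# which makes the finite difference exact up to floating-point error.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
grad = lambda x: A @ x
x = np.array([0.5, -1.0])
v = np.array([1.0, 2.0])

print(hvp_via_gradients(grad, x, v))  # ≈ A @ v = [4. 7.]
```

For non-quadratic objectives the approximation incurs O(ε) error, which is the usual trade-off against the exact (but more expensive) Hessian-vector oracle.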





A Algorithms

Neural Information Processing Systems

Below we include detailed pseudocode for the algorithms described in the main text. Algorithm 2 (Parameter-Free DeltaShift) takes as input implicit matrix-vector multiplication access to A. In this section, we give a full proof of Theorem 1.1 with the correct logarithmic dependence. Before doing so, we collect several definitions and results required for proving the theorem. As discussed, a tight analysis of Hutchinson's estimator, and also of our DeltaShift algorithm, relies on these results. Finally, from Claim B.2, we immediately have the bound for Rademacher random vectors; a similar analysis can be performed for any i.i.d. distribution. Now we are ready to move on to the main result. The proof is by induction: we claim the bound holds for all j = 1, ..., m, and then consider the inductive case.
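For context, Hutchinson's estimator referenced above estimates tr(A) using only implicit matrix-vector products with A. A minimal sketch with Rademacher vectors follows; the sample count and the toy matrix are illustrative, and this is the classical estimator rather than the paper's DeltaShift algorithm:

```python
import numpy as np

def hutchinson_trace(matvec, n, num_samples=1000, rng=None):
    """Hutchinson's estimator: tr(A) ~ (1/m) sum_i g_i^T A g_i for random
    g_i with E[g g^T] = I. Uses Rademacher (+/-1) vectors; only implicit
    matrix-vector multiplication access to A is required."""
    if rng is None:
        rng = np.random.default_rng()
    est = 0.0
    for _ in range(num_samples):
        g = rng.choice([-1.0, 1.0], size=n)  # Rademacher vector
        est += g @ matvec(g)
    return est / num_samples

# Toy check: a small explicit matrix with known trace 6.3.
A = np.diag([1.0, 2.0, 3.0]) + 0.1
print(hutchinson_trace(lambda v: A @ v, 3, rng=np.random.default_rng(0)))
```

With Rademacher vectors the diagonal contributions are recovered exactly, so the estimator's variance depends only on the off-diagonal mass of A, which is the property the tight analysis exploits.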


iMAML algorithm performs better than MAML

Neural Information Processing Systems

We thank the reviewers for the thoughtful feedback! Reviewer #1: Thank you for the thoughtful questions! We do not require convexity of L anywhere. Furthermore, regularity conditions are often needed for analysis but not to run the algorithm. Similarly, iMAML shows promising empirical results.


GPU-Accelerated Primal Learning for Extremely Fast Large-Scale Classification: Supplementary Material

Neural Information Processing Systems

Speedups were tested for both batch gradient descent (with a 0.001 learning rate) and L-BFGS. Let 1 denote the indicator function. TRON is detailed in Algorithm 1. The other direction is slightly different.
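As a hedged illustration of the batch gradient descent baseline mentioned above, here is a minimal sketch of batch gradient descent on an L2-regularized logistic loss (the primal classification objective; the function names, data, and hyperparameters are assumptions for illustration, not the paper's TRON or L-BFGS implementations):

```python
import numpy as np

def batch_gd_logreg(X, y, lr=0.001, steps=1000, reg=1.0):
    """Batch gradient descent on the L2-regularized logistic loss
        f(w) = (reg/2) ||w||^2 + sum_i log(1 + exp(-y_i x_i^T w)),
    with labels y_i in {-1, +1}. Every step uses the full batch."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        z = X @ w
        p = 1.0 / (1.0 + np.exp(-y * z))        # sigma(y_i x_i^T w)
        grad = reg * w - X.T @ (y * (1.0 - p))  # full-batch gradient
        w -= lr * grad
    return w

# Toy separable data: positive class at x > 0, negative at x < 0.
X = np.array([[1.0], [2.0], [-1.0], [-2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = batch_gd_logreg(X, y)
print(np.sign(X @ w))  # matches the labels y on this toy data
```

Each iteration touches the whole dataset, which is exactly the per-step cost that GPU acceleration of the primal solvers targets.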